Search CORE

16 research outputs found

Exploring the Relationship between Membership Turnover and Productivity in Online Communities

Author: Cunningham Pádraig
Qin Xiangju
Salter-Townshend Michael
Publication date: 30/01/2014

One of the more disruptive reforms associated with the modern Internet is the emergence of online communities working together on knowledge artefacts such as Wikipedia and OpenStreetMap. Recently it has become clear that these initiatives are vulnerable because of problems with membership turnover. This study presents a longitudinal analysis of 891 WikiProjects where we model the impact of member turnover and social capital losses on project productivity. By examining social capital losses we attempt to provide a more nuanced analysis of member turnover. In this context social capital is modelled from a social network perspective where the loss of more central members has more impact. We find that only a small proportion of WikiProjects are in a relatively healthy state with low levels of membership turnover and social capital losses. The results show that the relationship between social capital losses and project performance is U-shaped, and that member withdrawal has a significant negative effect on project outcomes. The results also support the mediation of turnover rate and network density on the curvilinear relationship.

arXiv.org e-Print Archive

Research Repository UCD

Irish Universities

Association for the Advancement of Artificial Intelligence: AAAI Publications

Distributed Bayesian Matrix Factorization with Limited Communication

Author: Blomstedt Paul
Kaski Samuel
Leppäaho Eemeli
Parviainen Pekka
Qin Xiangju
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

Bayesian matrix factorization (BMF) is a powerful tool for producing low-rank representations of matrices and for predicting missing values and providing confidence intervals. Scaling up the posterior inference for massive-scale matrices is challenging and requires distributing both data and computation over many workers, making communication the main computational bottleneck. Embarrassingly parallel inference would remove the communication needed, by using completely independent computations on different data subsets, but it suffers from the inherent unidentifiability of BMF solutions. We introduce a hierarchical decomposition of the joint posterior distribution, which couples the subset inferences, allowing for embarrassingly parallel computations in a sequence of at most three stages. Using an efficient approximate implementation, we show improvements empirically on both real and simulated data. Our distributed approach is able to achieve a speed-up of almost an order of magnitude over the full posterior, with a negligible effect on predictive accuracy. Our method outperforms state-of-the-art embarrassingly parallel MCMC methods in accuracy, and achieves results competitive with other available distributed and parallel implementations of BMF. Comment: 28 pages, 8 figures. The paper is published in the Machine Learning journal. An implementation of the method is available in the SMURFF software on GitHub (bmfpp branch): https://github.com/ExaScience/smurf

arXiv.org e-Print Archive

Aaltodoc Publication Archive

Pedestrian Counting Based on Piezoelectric Vibration Sensor

Author: Hou Weiyan
Hussain Shabir
Qin Xiangju
Weis Torben
Yu Yang
Publication date: 01/02/2022

Pedestrian counting has attracted much interest from the academic and industrial communities because of its widespread application in many real-world scenarios. While many recent studies have focused on computer vision-based solutions for the problem, the deployment of cameras raises concerns about privacy invasion. This paper proposes a novel indoor pedestrian counting approach, based on footstep-induced structural vibration signals with piezoelectric sensors. The approach is privacy-protecting because no audio or video data is acquired. Our approach analyzes the space-differential features from the vibration signals caused by pedestrian footsteps and outputs the number of pedestrians. The proposed approach supports multiple pedestrians walking together with signal mixture. Moreover, it makes no requirement about the number of groups of walking people in the detection area. The experimental results show that the averaged F1-score of our approach is over 0.98, which is better than the vibration signal-based state-of-the-art methods. Peer reviewed

Helsingin yliopiston digitaalinen arkisto

The influence of network structures of Wikipedia discussion pages on the efficiency of WikiProjects

Author: Cunningham Pádraig
Qin Xiangju
Salter-Townshend Michael
Publication venue: Elsevier BV
Publication date: 01/01/2015


Research Repository UCD

Irish Universities

Oxford University Research Archive

Online Trans-dimensional von Mises-Fisher Mixture Models for User Profiles

Author: Cunningham Pádraig
Qin Xiangju
Salter-Townshend Michael
Publication venue: Journal of Machine Learning Research
Publication date: 01/01/2016

The proliferation of online communities has attracted much attention to modelling user behaviour in terms of social interaction, language adoption and contribution activity. Nevertheless, when applied to large-scale and cross-platform behavioural data, existing approaches generally suffer from expressiveness, scalability and generality issues. This paper proposes trans-dimensional von Mises-Fisher (TvMF) mixture models for L2 normalised behavioural data, which encapsulate: (1) a Bayesian framework for vMF mixtures that enables prior knowledge and information sharing among clusters, (2) an extended version of reversible jump MCMC algorithm that allows adaptive changes in the number of clusters for vMF mixtures when the model parameters are updated, and (3) an online TvMF mixture model that accommodates the dynamics of clusters for time-varying user behavioural data. We develop efficient collapsed Gibbs sampling techniques for posterior inference, which facilitates parallelism for parameter updates. Empirical results on simulated and real-world data show that the proposed TvMF mixture models can discover more interpretable and intuitive clusters than other widely-used models, such as k-means, non-negative matrix factorization (NMF), Dirichlet process Gaussian mixture models (DP-GMM), and dynamic topic models (DTM). We further evaluate the performance of proposed models in real-world applications, such as the churn prediction task, that shows the usefulness of the features generated. Science Foundation Ireland

Research Repository UCD

Irish Universities

Oxford University Research Archive

Learning from data streams with only positive and unlabeled data

Author: Li Chen
Li Xue
Qin Xiangju
Zhang Yang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/06/2013

Many studies on streaming data classification have been based on a paradigm in which a fully labeled stream is available for learning purposes. However, it is often too labor-intensive and time-consuming to manually label a data stream for training. This difficulty may cause conventional supervised learning approaches to be infeasible in many real-world applications, such as credit fraud detection, intrusion detection, and rare event prediction. In previous work, Li et al. suggested that these applications be treated as a Positive and Unlabeled learning problem, and proposed a learning algorithm, OcVFDT, as a solution (Li et al. 2009). Their method requires only a set of positive examples and a set of unlabeled examples, which is easily obtainable in a streaming environment, making it widely applicable to real-life applications. Here, we enhance Li et al.’s solution by adding three features: an efficient method to estimate the percentage of positive examples in the training stream, the ability to handle numeric attributes, and the use of more appropriate classification methods at tree leaves. Experimental results on synthetic and real-life datasets show that our enhanced solution (called PUVFDT) has very good classification performance and a strong ability to learn from data streams with only positive and unlabeled examples. Furthermore, our enhanced solution reduces the learning time of OcVFDT by about an order of magnitude. Even with 80% of the examples in the training data stream unlabeled, PUVFDT can still achieve a competitive classification performance compared with that of VFDTcNB (Gama et al. 2003), a supervised learning algorithm.

University of Queensland eSpace

Pedestrian Counting Based on Piezoelectric Vibration Sensor

Author: Hou Weiyan
Hussain Shabir
Qin Xiangju
Weis Torben
Yu Yang
Publication venue: MDPI AG
Publication date: 01/02/2022

Helsingin yliopiston digitaalinen arkisto

Pedestrian Counting Based on Piezoelectric Vibration Sensor

Author: Hou Weiyan
Hussain Shabir
Qin Xiangju
Weis Torben
Yu Yang
Publication venue: MDPI AG
Publication date: 12/02/2022

Multidisciplinary Digital Publishing Institute